A Vision for Performing Social and Economic Data Analysis using Wikipedia's Edit History

Authors

  • Erik Dahm
  • Moritz Schubotz
  • Norman Meuschke
  • Bela Gipp
Abstract

In this vision paper, we suggest combining two lines of research to study the collective behavior of Wikipedia contributors. The first line of research analyzes Wikipedia’s edit history to quantify the quality of individual contributions and the resulting reputation of the contributor. The second line of research surveys Wikipedia contributors to gain insights, e.g., on their personal and professional background, socioeconomic status, or motives to contribute to Wikipedia. While both lines of research are valuable on their own, we argue that combining the two approaches could yield insights that exceed the sum of the individual parts. Linking survey data to contributor reputation and content-based quality metrics could provide a large-scale, public-domain dataset to perform user modeling, i.e., deducing interest profiles of user groups. User profiles can, among other applications, help to improve recommender systems. The resulting dataset can also enable a better understanding and improved prediction of high-quality Wikipedia content and successful Wikipedia contributors. Furthermore, the dataset can enable novel research approaches to investigate team composition and collective behavior, as well as help to identify domain experts and young talents. We report on the status of implementing our large-scale, content-based analysis of the Wikipedia edit history using the big data processing framework Apache Flink. Additionally, we describe our plans to conduct a survey among Wikipedia contributors to enhance the content-based quality metrics.


Similar resources

Wikipedia Revision Toolkit: Efficiently Accessing Wikipedia's Edit History

We present an open-source toolkit that makes it possible (i) to reconstruct past states of Wikipedia, and (ii) to efficiently access the edit history of Wikipedia articles. Reconstructing past states of Wikipedia is a prerequisite for reproducing previous experimental work based on Wikipedia. Beyond that, the edit history of Wikipedia articles has been shown to be a valuable knowledge source for NLP, but...


A Modified Directional Distance Formulation of DEA with Malmquist Index to Assess Bankruptcy

Bankruptcy in the same amount of time and history is very rampant and therefore the vision of the future can be prevented. Data envelopment analysis (DEA) and the Malmquist index can precisely evaluate the performance of many different kinds of decision-making units (DMUs), such as hospitals, universities, business firms, etc. In this paper, we will modify directional distance formulation o...


Evaluating How the Islamic Republic of Iran Achieved its Vision Economic Goals by Designing and Computing a Composite Index and Using Monte Carlo Simulation Approach for Uncertainty Analysis

The Islamic Republic of Iran Vision aimed to direct development plans and yearly budgets. As one of its most important parts, it includes 13 economic goals that direct the country's economic route over a twenty-year period. After 15 years, evaluating how far the country has achieved these goals is very important and unfortunately neglected. In this article, considering the different nature of goal...


Analysis of non-performing loans of banks with the regional economic approach: unbalanced panel data method

The high volume of non-performing loans and its trend over time is one of the main problems of banks and central banking in Iran. The high volume of non-current receivables and its destructive effects make it very important to identify and examine the factors affecting it. In the present study, determinant factors from the borrower, lender and regional...


Assessing Vision-Related Quality of Life of Older People in Tehran and Associated Factors

Background: Visual disorders in old age are among the most important factors in decreasing the quality of life of older people. This study aimed to assess the vision-related quality of life of older people in Tehran and to examine some of the underlying factors. Materials and Methods: This was a population-based cross-sectional study among 566 older people aged 60 years and over, living in Tehran....




Publication date: 2017